Clustering is the unsupervised classification of patterns (observations, data items, or
feature vectors) into groups (clusters). The clustering problem has been addressed in many 
contexts and by researchers in many disciplines; this reflects its broad appeal and 
usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult 
problem combinatorially, and differences in assumptions and contexts in different 
communities has made the transfer of useful generic co ... 



Keywords: cluster analysis, clustering applications, exploratory data analysis, incremental 
clustering, similarity indices, unsupervised learning 
Guowei Zu, Wataru Ohyama, Tetsushi Wakabayashi, Fumitaka Kimura

November 2003 Proceedings of the 2003 ACM symposium on Document engineering 

Full text available: pdf(136.78 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we describe a comparative study on techniques of feature transformation and 
classification to improve the accuracy of automatic text classification. The normalization to 
the relative word frequency, the principal component analysis (K-L transformation) and the 
power transformation were applied to the feature vectors, which were classified by the 
Euclidean distance, the linear discriminant function, the projection distance, the modified 
projection distance and the SVM. 

Keywords: automatic text classification, principal component analysis, variable 
transformation 
John M. Morris

June 1980 ACM SIGLASH Newsletter, volume 13 issue 2 
Over the past eight years we have developed statistical methods for characterizing,
classifying, and retrieving brief natural-language messages. Our goal was to provide a tool 
for people who had to deal with enormous numbers of heterogeneous documents, using ill- 
defined criteria of relevance and interest. Initially, we worked with a large, general-purpose 
system, the On-Line Pattern Analysis and Recognition System (OLPARS). More recently, we
have developed a system called Message Extraction Throu ... 

In search of information in visual media

Amarnath Gupta, Simone Santini, Ramesh Jain 

December 1997 Communications of the ACM, Volume 40 Issue 12

Full text available: pdf(1.58 MB) Additional Information: full citation, references, citings, index terms



5 Applications: Fast retrieval of high-dimensional feature vectors in P2P networks using
compact peer data summaries

Wolfgang Muller, Andreas Henrich 

November 2003 Proceedings of the 5th ACM SIGMM international workshop on 
Multimedia information retrieval 

Full text available: pdf(376.07 KB) Additional Information: full citation, abstract, references, index terms

The retrieval facilities of most Peer-to-Peer (P2P) systems are limited to queries based on a 
unique identifier or a small set of keywords. The techniques used for this purpose are 
hardly applicable for content-based image retrieval (CBIR) in a P2P network. Furthermore, 
we will argue that the curse of dimensionality and the high communication overhead 
prevent the adaptation of multidimensional search trees or fast sequential scan techniques 
for P2P CBIR. In the present paper we will propose two ... 

6 Computational strategies for object recognition

Paul Suetens, Pascal Fua, Andrew J. Hanson 

March 1992 ACM Computing Surveys (CSUR), Volume 24 issue 1

Full text available: pdf(6.37 MB) Additional Information: full citation, abstract, references, citings, index terms

This article reviews the available methods for automated identification of objects in digital
images. The techniques are classified into groups according to the nature of the 
computational strategy used. Four classes are proposed: (1) the simplest strategies, which 
work on data appropriate for feature vector classification, (2) methods that match models 
to symbolic data structures for situations involving reliable data and complex models, (3) 
approaches that fit models to the photometry and ... 

Keywords: image understanding, model-based vision, object recognition 
November 2003 Proceedings of the 2003 ACM SIGMM workshop on Biometrics methods
and applications 

Full text available: pdf(422.83 KB) Additional Information: full citation, abstract, references, index terms

A novel feature generation scheme which combines multiclass mapping of Fisher scores and 
appearance based features for face recognition (FR) is proposed in this paper. Multi-class 
mapping of Fisher scores is based on partial derivative analysis of parameters of hidden 
Markov model (HMM), and appearance based features are obtained directly from face
images. Linear discriminant analysis (LDA) is used to analyze the feature vectors generated
under this scheme. Recognition performance improvement is ... 

Keywords: Fisher score, hidden Markov model, linear discriminant analysis 



8 Content-based image retrieval for multimedia databases: Image database retrieval
utilizing affinity relationships

Mei-Ling Shyu, Shu-Ching Chen, Min Chen, Chengcui Zhang, Kanoksri Sarinnapakorn 
November 2003 Proceedings of the 1st ACM international workshop on Multimedia 



Recent research effort in Content-Based Image Retrieval (CBIR) focuses on bridging the 
gap between low-level features and high-level semantic contents of images as this gap has 
become the bottleneck of CBIR. In this paper, an effective image database retrieval 
framework using a new mechanism called the Markov Model Mediator (MMM) is presented 
to meet this demand by taking into consideration not only the low-level image features, but 
also the high-level concepts learned from the history of user's ... 

Human-machine perceptual cooperation
Francis K. H. Quek, Michael C. Petro 

May 1993 Proceedings of the SIGCHI conference on Human factors in computing 
systems 

Full text available: pdf(972.26 KB) Additional Information: full citation, abstract, references, index terms

The Human-Machine Perceptual Cooperation (HMPC) paradigm combines a human 
operator's high level reasoning with machine perception to solve spatio-perceptual intensive 
problems. HMPC defines two channels of interaction: the focus of attention (FOA) by which 
the user directs the attention of machine perception, and context. As the user moves the 
FOA across a display via a pointing device, a smart cursor operates proactively on the data, 
highl ... 

Keywords: document image analysis, human-computer interaction, map conversion, 
shared perception, telerobotics 



10 Database theory, technology and applications (DTTA): Integrating similarity-based
queries in image DBMSs

Solomon Atnafu, Richard Chbeir, David Coquil, Lionel Brunie 

March 2004 Proceedings of the 2004 ACM symposium on Applied computing 



Until recently, issues in image retrieval have been handled in DBMSs and in computer vision 
as separate research works. Nowadays, the trend is towards integrating the two approaches 
(content- and metadata-based) for multi-criteria image retrieval. However, most existing 
works and proposals in this domain lack a formal framework to deal with a multi-criteria 
query. In this paper, we introduce a formal framework to address this subject of image 
retrieval under an ORDBMS model. We first propose an ... 

Keywords: image DBMS, multi-criteria retrieval, multimedia algebra 
This paper presents a comparative study and survey of model-based object-recognition
algorithms for robot vision. The goal of these algorithms is to recognize the identity, 
position, and orientation of randomly oriented industrial parts. In one form this is 
commonly referred to as the "bin-picking" problem, in which the parts to be recognized are 
presented in a jumbled bin. The paper is organized according to 2-D, 2½-D, and 3-D object
representations, which are used as the basis for ... 



12 Graph-based hierarchical conceptual clustering
Istvan Jonyer, Diane J. Cook, Lawrence B. Holder 
March 2002 The Journal of Machine Learning Research, Volume 2 

Full text available: pdf(228.03 KB) Additional Information: full citation, abstract, references, index terms 

Hierarchical conceptual clustering has proven to be a useful, although under-explored, data 
mining technique. A graph-based representation of structural information combined with a 
substructure discovery technique has been shown to be successful in knowledge discovery. 
The SUBDUE substructure discovery system provides one such combination of approaches. 
This work presents SUBDUE and the development of its clustering functionalities. Several 
examples are used to illustrate the validity of the app ... 

Keywords: cluster analysis, clustering, concept formation, graph match, structural data 



13 Image Retrieval from the World Wide Web: Issues, Techniques, and Systems
M. L. Kherfi, D. Ziou, A. Bernardi

March 2004 ACM Computing Surveys (CSUR), volume 36 issue 1

Full text available: pdf(294.13 KB) Additional Information: full citation, abstract, references, index terms

With the explosive growth of the World Wide Web, the public is gaining access to massive 
amounts of information. However, locating needed and relevant information remains a 
difficult task, whether the information is textual or visual. Text search engines have existed 
for some years now and have achieved a certain degree of success. However, despite the 
large number of images available on the Web, image search engines are still rare. In this 
article, we show that in order to allow people to profi ... 

Keywords: Image-retrieval, World Wide Web, crawling, feature extraction and selection, 
indexing, relevance feedback, search, similarity 



14 Visual information retrieval 
Amarnath Gupta, Ramesh Jain 

May 1997 Communications of the ACM, volume 40 issue 5 
Manuele Bicego, Gianluca Iacono, Vittorio Murino

November 2003 Proceedings of the 2003 ACM SIGMM workshop on Biometrics methods 
and applications 

Full text available: pdf(427.53 KB) Additional Information: full citation, abstract, references, index terms

This paper presents a new face recognition system, based on Multilevel B-splines and
Support Vector Machines. The idea is to consider face images as heightfields, in which the
height relative to each pixel is given by the corresponding gray level. Such heightfields are 
approximated using Multilevel B-Splines, and the coefficients of approximation are used as 
features for the classification process, which is performed using Support Vector Machines. 
The proposed approach was thoroughly tested, usi ... 

Keywords: Multi Level B-splines, Support Vector Machines, face recognition 



16 A model of multimedia information retrieval 
Carlo Meghini, Fabrizio Sebastiani, Umberto Straccia 
September 2001 Journal of the ACM (JACM), volume 48 issue 5 

Full text available: pdf(5.69 MB) Additional Information: full citation, abstract, references, citings, index terms

Research on multimedia information retrieval (MIR) has recently witnessed a booming 
interest. A prominent feature of this research trend is its simultaneous but independent 
materialization within several fields of computer science. The resulting richness of 
paradigms, methods and systems may, on the long run, result in a fragmentation of efforts 
and slow down progress. The primary goal of this study is to promote an integration of
methods and techniques for MIR by contributing a conceptual model ... 

Keywords: Description logics, fuzzy logics, multimedia information retrieval




17 Image similarity search systems: A compact and efficient image retrieval approach
based on border/interior pixel classification 
Renato O. Stehling, Mario A. Nascimento, Alexandre X. Falcao 

November 2002 Proceedings of the eleventh international conference on Information 
and knowledge management 

Full text available: pdf(890.44 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents BIC (Border/Interior pixel Classification), a compact and efficient CBIR
approach suitable for broad image domains. It has three main components: (1) a simple 
and powerful image analysis algorithm that classifies image pixels as either border or 
interior, (2) a new logarithmic distance (dLog) for comparing histograms, and (3) a 
compact representation for the visual features extracted from images. Experimental results 
show that the BIC appro ... 

Keywords: CBIR, color histogram, content-based image retrieval, distance function, image 
analysis 



18 Integrating symbolic images into a multimedia database system using classification
and abstraction approaches
Aya Soffer, Hanan Samet 

December 1998 The VLDB Journal — The International Journal on Very Large Data 
Bases, Volume 7 Issue 4

Full text available: pdf(227.30 KB) Additional Information: full citation, abstract, index terms

Symbolic images are composed of a finite set of symbols that have a semantic meaning. 
Examples of symbolic images include maps (where the semantic meaning of the symbols is 
given in the legend), engineering drawings, and floor plans. Two approaches for supporting 
queries on symbolic-image databases that are based on image content are studied. The 
classification approach preprocesses all symbolic images and attaches a semantic 
classification and an associated certainty factor to each object that ... 
Keywords: Image indexing, Multimedia databases, Query optimization, Retrieval by
content, Spatial databases, Symbolic-image databases 



19 The space of human
Brett Allen, Brian Curless, Zoran Popovic

July 2003 ACM Transactions on Graphics (TOG), volume 22 issue 3 



We develop a novel method for fitting high-resolution template meshes to detailed human 
body range scans with sparse 3D markers. We formulate an optimization problem in which 
the degrees of freedom are an affine transformation at each template vertex. The objective 
function is a weighted combination of three measures: proximity of transformed vertices to 
the range data, similarity between neighboring transformations, and proximity of sparse 
markers at corresponding locations on the template and ... 

Keywords: deformations, morphing, non-rigid registration, synthetic actors 
Merence^
Atsuhiro Takasu 

May 2003 Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries 



In this paper, we propose a method for extracting bibliographic attributes from reference 
strings captured using Optical Character Recognition (OCR) and an extended hidden Markov 
model. Bibliographic attribute extraction can be used in two ways. One is reference parsing 
in which attribute values are extracted from OCR-processed references for bibliographic 
matching. The other is reference alignment in which attribute values are aligned to the 
bibliographic record to enrich the vocabulary of the ... 